Journal of Clinical and Translational Science

Cambridge University Press (CUP)

Preprints posted in the last 90 days, ranked by how well they match Journal of Clinical and Translational Science's content profile, based on 11 papers previously published here. The average preprint has a 0.07% match score for this journal, so anything above that is already an above-average fit.

1
Career mentoring matters: A multi-component program for early-stage HIV investigators at the University of California, San Francisco

Fuchs, J. D.; Melo, J. S.; Sauceda, J. A.; Watabe, J.; Sterling, L.; Johnson, M. O.; Gandhi, M.

2026-03-02 medical education 10.64898/2026.02.24.26346718
Top 0.1%
163× avg

Background: Evidence supports the key role research mentors play in bolstering the success of early-stage investigators (ESIs). However, there are limited data about the impact on ESIs of supplemental, cross-disciplinary career mentorship and professional development opportunities, which are seldom included in academic training. We assessed the perceived value of this approach among postdoctoral fellows and early-career faculty who participated in a multi-component career mentoring program organized by the University of California, San Francisco Center for AIDS Research (UCSF CFAR). Methods: We surveyed past program participants (2005-2020), assessing demographics, current career status, perceived impact of the program, and feedback on program elements. We performed thematic analysis on open-ended responses to explore program benefits. Results: Of 146 program participants contacted, 102 responded (70% response rate). Nearly two thirds (65%) were female, and 38% self-identified as underrepresented minority (URM) investigators. A majority of respondents now dedicate >70% of their time to research. All would recommend the program to ESI colleagues, and over 80% reported that their CFAR mentors influenced their career trajectories in several ways, including help with grant writing, linkage to researchers that sparked new collaborations, and support through personal challenges or in navigating conflict with primary research mentors. While 90% of URM ESIs valued advice from CFAR mentors, only a third reported receiving specific support around challenges faced as minoritized investigators. Conclusions: A career mentoring program designed to complement the support offered by research mentors positively influenced the career trajectory of ESIs. Focused efforts are needed to support URM investigators who face ongoing structural barriers to success in academic settings.

2
Motivators and Barriers to PA Preceptorship in North Carolina

Stabingas, K.; Gerstner, L.; Rachis, S.

2026-02-17 medical education 10.64898/2026.02.16.26346405
Top 0.1%
137× avg

Introduction: Physician assistant (PA) programs face persistent challenges in recruiting and retaining clinical preceptors due to time constraints, administrative burden, lack of compensation, and limited training. Additional pressures, such as health care consolidation, program expansion, clinician burnout, and financial implications of paid clinical sites, further strain preceptorship capacity. This study examines motivators and barriers influencing clinicians' willingness to precept PA students. Methods: This mixed-methods study used snowball sampling to recruit current, former, and non-precepting PAs across North Carolina. Participants completed surveys with Likert-scale and open-ended items adapted from the 2011 National Survey of Physician Assistants. Participants in four virtual focus groups, selected from survey respondents, completed semi-structured interviews informed by Self-Determination Theory (SDT). Quantitative data were analyzed using descriptive statistics and ordinal logistic regression; qualitative data underwent thematic analysis with deductive SDT coding and inductive refinement. Triangulation integrated findings. Results: Respondents (N = 158) represented diverse clinical experience. Top motivators included student quality (66%), program support (53%), and financial compensation (51%). Key barriers were student quality (61.29%), burnout (53.23%), and lack of compensation (46.77%). From the focus group discussions, four themes emerged: Student Quality, Financial Compensation, Non-Financial Incentives, and Administrative Support. Student preparedness acted as both motivator and barrier; compensation concerns focused on fairness. Discussion: Preceptorship relies on relational and professional factors, student quality, recognition, and institutional alignment, rather than financial incentives alone. System inefficiencies, inadequate preparation, and misaligned compensation hinder engagement. Improving student readiness, enhancing institutional support, and implementing transparent, layered incentives may strengthen recruitment and retention.

3
Enhancing competency in clinical trials management: Findings from a multicountry trial coordinators interventional training program

Ejigu, D. A.; Fekadu, A.; Makonnen, E.; Conradie, A.; Okech, B.; Lehrman, J.; Birhane, R.; Vahedi, M.; Manyazewal, T.

2026-03-04 medical education 10.64898/2026.03.03.26347517
Top 0.1%
120× avg

Background: Clinical research coordinators play a crucial role in ensuring the scientific rigor, regulatory compliance, and operational integrity of clinical trials. However, in Africa, they often lack access to structured, competency-based training, especially in operational, regulatory, and trial management domains. This study evaluated the effectiveness of a comprehensive training intervention designed to standardize and enhance core competencies of clinical trial coordinators. Methods: We conducted a prospective pre-post interventional study among cohorts of clinical research professionals completing a 10-week, internationally accredited, Moodle-based clinical trial operations training program aligned with the Joint Task Force Core Competency Framework, covering 10 lessons and 25 domains. Self-reported competence was evaluated at baseline and post-training. Data analyses included paired t-tests for aggregate scores, McNemar's exact test for domain-level proportions, multivariable logistic regression for predictors of improvement, and Cohen's d for effect size. Results: Of the 166 participants from 19 African countries who enrolled and completed the pre-training survey, the 152 who completed the program and the post-training survey were included. The training significantly increased the mean aggregate competence from 12.24 ± 7.85 (out of a maximum of 25) to 23.35 ± 2.73 (mean difference: 11.11; 95% CI 9.86-12.36; p<0.001; Cohen's d=1.41). Score variance decreased, with the median score increasing from 12.0 (IQR: 6.0-19.0) to 24.5 (IQR: 23.0-25.0). All 25 domains improved (p<0.001), with the largest gains in complex, low-baseline domains: managing external partners (+59.2%), project management (+58.6%), financial management (+55.3%), and trial close-out (+57.2%). Ethical principles and informed consent, which had high baseline competence, reached near-universal levels at 99.3% and 98.7%, respectively. No differences were observed by country or gender (p>0.05). Conclusion: Structured, competency-based training strengthens clinical trial coordinators' capabilities, particularly in technical and administrative domains that are often overlooked. Accredited, framework-aligned clinical trial training programs promote consistent trial quality, strengthen research capacity, and sustain excellence in clinical trial delivery. What is already known on this topic: Clinical research coordinators play a crucial role in ensuring the scientific rigor, regulatory compliance, and operational integrity of clinical trials. What this study adds: The study evaluated the effectiveness of a comprehensive training intervention designed to standardize and enhance core competencies of clinical trial coordinators in Africa, where they often lack access to structured, competency-based training. How this study might affect research, practice or policy: This study should encourage the design and delivery of internationally accredited, Moodle-based clinical trial operations training programs in Africa that enhance clinical trial competency.

4
Volunteering at a Student-Run Clinic and Matching into Primary Care Specialties

Brock, D. C.; Kumar, A.; Engebretson, H.; Grant, S.; Khan, Z.; Kontoyiannis, P. D.; DiLeo, M. J.; Kamepalli, S.; Joe, M. K.; Peoples, N.; Altman, M. A.; Pillow, M. T.; Clark, D. L.

2026-01-24 medical education 10.64898/2026.01.22.26344668
Top 0.1%
117× avg

Background: Student-run clinics (SRCs) serve a unique role in healthcare by addressing the needs of underserved communities while providing medical students with hands-on learning experiences. The Houston Outreach Medicine Education and Social Services (HOMES) Clinic is an SRC and a program of Healthcare for the Homeless - Houston that provides medical care to individuals experiencing unstable housing in Houston, Texas. Amid a growing shortage of primary care physicians in the United States, understanding factors that influence specialty choice is critical. This study aimed to explore whether volunteering at HOMES Clinic is associated with an increased likelihood of matching into primary care specialties. Methods: This study used a retrospective cohort design of HOMES Clinic volunteers from 2014-2025. Students who volunteered at HOMES Clinic represented the exposure group (n=1,157), while non-volunteers served as the reference group (n=3,666). The primary outcome was the association between volunteering and matching into primary care specialties. Secondary outcomes included residency program rank, in-state residency placement, and induction into the Alpha Omega Alpha and Gold Humanism Honor Societies. Results: HOMES Clinic volunteers matched into primary care specialties at a 7.5% higher rate than non-volunteers (p=1.3×10⁻⁵; OR=1.38; 95% CI = 1.20-1.59). Conversely, a 5.3% lower proportion of HOMES volunteers matched into surgical specialties (p=8.9×10⁻⁴; OR=0.76; 95% CI = 0.65-0.89). Volunteering also showed a modest association with matching into higher-ranked residency programs (p<0.05), and volunteers had 25% higher odds of Alpha Omega Alpha induction and 41% higher odds of Gold Humanism Honor Society induction. Conclusions: Volunteering at HOMES Clinic showed a positive association with matching into primary care specialties. This trend likely reflects both self-selection of students interested in primary care and the influence of SRC experience on shaping student residency specialty choices. Our results provide insights into how medical schools and SRCs foster the development of the next generation of primary care physicians.

5
Student Scholarly Research Programs in US Medical Schools: Cross-sectional Web Audit

Lee, D.; Lee, C.; Oh, S. S.; Lee, K.; Hyun, C. S.; Shin, J. I.; An, S.; Ioannidis, J.

2026-03-04 medical education 10.64898/2026.03.03.26347497
Top 0.1%
113× avg

Background: Participating in research during medical school is supported by institutional programs and may influence subsequent professional development. Objective: We aimed to describe the current status and heterogeneity of scholarly research programs for medical students in the United States, including expectations, support, and key structural features. Methods: We conducted a cross-sectional web audit of official webpages for all accredited US MD- and DO-granting medical schools (search performed September 2024 to January 2025). Extracted variables included participation requirements, mentorship, timing and duration (overall and dedicated research time), expected scholarly outputs, funding sources, stipend information, and stated program goals. We compared Carnegie tier R1 (Very high research activity) versus other institutions, QS Top-50 versus other institutions, and MD versus DO schools using χ²/Fisher exact tests for 2×2 tables and exact trend or Freeman-Halton tests for multicategory variables. Results: Programs were identified for all 202 institutions. Funding was explicitly mentioned by 61.9% (125/202) of programs, 27.0% (51/189) were compulsory, 98.9% (188/190) reported faculty mentorship, and 91.0% (171/188) were exclusive to medical students. Program duration, dedicated time, expected outcomes, stipend reporting, funding sources, and stated goals varied widely. Carnegie R1 institutions had longer program duration (P=.002) and tended to report external funding more often than other institutions (25/104, 24.0% vs 9/98, 9.2%; OR 3.13, 95% CI 1.38-7.10; P=.008). QS Top-50 institutions were more likely to make participation compulsory than other institutions (11/19, 57.9% vs 40/170, 23.5%; OR 4.47, 95% CI 1.68-11.87; P=.003). No significant differences were observed between MD and DO programs across most measured characteristics. Conclusions: Scholarly research programs for medical students are ubiquitous across US medical schools but heterogeneous in structure, expectations, and support. Research-intensive and top-ranked institutions may have more external funding and may sometimes offer longer, compulsory programs. Further evaluation of student experiences and outcomes is warranted.
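
The odds ratios quoted in the abstract above can be checked against the counts it reports. The following is a minimal sketch, assuming a Wald (log-scale) 95% confidence interval; the abstract does not say which interval method the authors used, and the helper name `odds_ratio_wald` is hypothetical.

```python
import math


def odds_ratio_wald(events_a, total_a, events_b, total_b, z=1.96):
    """Odds ratio of group A vs group B with a Wald confidence interval.

    Hypothetical helper for illustration; z=1.96 gives an approximate 95% CI.
    """
    a, b = events_a, total_a - events_a  # group A: with / without the feature
    c, d = events_b, total_b - events_b  # group B: with / without the feature
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# External funding: Carnegie R1 (25/104) vs other institutions (9/98)
funding = odds_ratio_wald(25, 104, 9, 98)
# Compulsory participation: QS Top-50 (11/19) vs other institutions (40/170)
compulsory = odds_ratio_wald(11, 19, 40, 170)

print("funding:    OR %.2f (95%% CI %.2f-%.2f)" % funding)
print("compulsory: OR %.2f (95%% CI %.2f-%.2f)" % compulsory)
```

Under this assumption, both results round to the values given in the abstract (OR 3.13, 1.38-7.10 and OR 4.47, 1.68-11.87).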

6
Can AI Match Human Experts? Evaluating LLM-Generated Feedback on Resident Scholarly Projects

van Allen, Z.; Forgues-Martel, S.; Venables, M. J.; Ghanney, Y.; Villeneuve, A.; Dongmo, J.; Ahmed, M.; Archibald, D.; Jolin-Dahel, K.

2026-03-04 medical education 10.64898/2026.03.04.26346878
Top 0.1%
110× avg

Background: Delivering timely, high-quality feedback on resident scholarly projects is labour-intensive, especially in large programmes. We developed an AI-assisted evaluation system, powered by the open-weight LLaMA-3.1 large language model (LLM), to generate formative feedback on Family Medicine residents' scholarly projects and compared its performance with expert human evaluators. Methods: We evaluated whether the AI-generated feedback achieves comparable quality to expert feedback. The tool ingests heterogeneous resident submissions (PDFs, scans, photographs) via OCR and produces section-by-section feedback aligned with programme rubrics. In a three-phase study we evaluated 240 feedback reports (Short, Question and Timeline, Final; n = 80 each). Within each phase, 40 reports were AI-generated and 40 were produced by research experts across four project types: Quality Improvement, Survey-Based, Research, and Literature Review. Blinded raters used a 25-item survey across five constructs: understanding & reasoning, trust & confidence, quality of information, expression style & persona, safety & harm. Results: Survey reliability was high across phases (α = .71-.98). Human feedback generally out-scored AI. In short reports, humans led on quality (mean ± SD: 4.14 ± 0.57 vs 3.09 ± 1.05) and trust (3.96 ± 0.71 vs 2.78 ± 1.15). In final reports, differences became small for quality (4.09 ± 0.65 vs 3.49 ± 0.68) and persona (4.16 ± 0.40 vs 3.91 ± 0.50), while AI was preferred for safety (4.50 ± 0.60 vs 4.36 ± 0.56). Performance varied by project type: in survey-based final reports the AI led on quality (4.28 ± 0.50 vs 3.98 ± 0.44) and safety (4.58 ± 0.40 vs 4.24 ± 0.67), whereas in quality-improvement short reports humans were markedly superior in reasoning (4.27 ± 0.68 vs 2.33 ± 1.00). Conclusions: An open-weight LLM with curated prompts can generate rubric-aligned feedback at scale that approaches the quality of expert human feedback. While expert feedback remained superior overall, AI surpassed humans in selected contexts and safety assessments. Performance of the tool is expected to improve over time as newer and more capable open-weight models are released. Our code and system prompts are open source.

7
You can't manage what you can't imagine: The Digital Health Checklist-Risk Management (DHC-RM) Tool to enhance participant protections in digital health research

Card, A. J.; Vital, D.; Nebeker, C.

2026-02-24 health policy 10.64898/2026.02.22.26346854
Top 0.1%
108× avg

Digital health technologies are powerful, enhancing data collection, participant engagement, and personalized health interventions, yet their rapid proliferation has outpaced guidance for research participant protection. Current practice assists researchers in identifying risks but provides limited support for comprehensive risk management. To address this gap, we developed the Digital Health Checklist-Risk Management (DHC-RM) Tool, which integrates the established Digital Health Checklist with approaches from safety risk management. We conducted a study (n=40) comparing the DHC-RM Tool with current practice using a randomized experimental difference-in-differences design. Primary outcomes were the quantity, variety, and novelty of risks identified; secondary outcomes were the same constructs applied to risk control development. Compared with current practice, use of the DHC-RM Tool resulted in dramatically improved performance across all primary outcomes. Users identified on average 14.7 additional risks (compared to baseline) versus 0.26 in the control group and a higher number of risks in each of six pre-identified risk domains. Half of all distinct risks identified in the comparison phase were identified exclusively using the tool. The tool also improved risk control design, producing 9.63 additional risk control strategies per participant compared with 0.15 for current practice and yielding substantially greater novelty and variety. User feedback was also positive: 75% of participants reported they would use the tool again, citing its structured workflow, just-in-time examples, improved insight into risks, and its value for IRB communication. Suggestions for refinement focused primarily on expanding training examples and providing additional support for risk control development. The DHC-RM Tool significantly improves risk management practice in digital health research. By embedding structured, ethics-informed risk management into digital health research design, the DHC-RM Tool has the potential to improve participant protection while also streamlining ethics approval. Author Summary: Digital health research can put participants (and others) at risk in ways that don't always occur to the researchers who are designing a study. Researchers also face challenges in prioritizing risks and coming up with ideas to reduce those risks. We developed a new approach, the Digital Health Checklist - Risk Management Tool (DHC-RM Tool), to give researchers the support they need to identify, assess, and address research participant risks in this fast-moving field. Our experimental study found that use of the DHC-RM Tool led to a very large improvement in how well researchers managed the risks of digital health research studies. Using the toolkit, they were able to identify more risks than they identified using current practice, including risks they would not otherwise have considered. They were also able to come up with more changes to reduce the risks associated with digital health research studies, including changes they would not otherwise have considered. Those who used the toolkit found it beneficial and easy to use. The DHC-RM Tool fills an important gap in the science and practice of participant protection in digital health research.

8
Randomized controlled trials claiming "personalized", "individualized" and "precision" interventions: characteristics, transparency and bias

Russo, L.; Lentini, N.; Soru, L.; Pastorino, R.; Boccia, S.; Ioannidis, J.

2026-02-12 medical education 10.64898/2026.02.09.26345904
Top 0.1%
97× avg

The terms personalized, individualized and precision medicine are increasingly used to describe health interventions, yet their operational meaning in clinical research remains unclear. Despite extensive conceptual discussion, there is limited empirical evidence on how these labels are applied in randomized controlled trials (RCTs) and whether such trials meet standards of transparency and methodological rigor. We systematically examined 262 RCTs published between 2020 and 2022 that used the terms "personalized", "individualized", or "precision" in the title to describe an intervention. The term "personalized" was used most frequently (49.2%), followed by "individualized" (45.8%) and "precision" (5.0%). In most trials, personalization involved behavioral, digital, or pharmacological interventions, with few studies employing -omics approaches. Personalization was most often based on individual lifestyle factors, psychological characteristics, or disease classification. We also found that in most trials, personalization consisted of tailoring a single intervention to individuals (82.8%), often through individualized dosage (73.2%). Most included RCTs were judged to be at high risk of bias and showed limited transparency with respect to data and code sharing. Our study suggests that, in contemporary RCTs, the labels "personalized", "individualized", and "precision" are applied interchangeably to a wide range of heterogeneous interventions that are predominantly non-genomic. Greater conceptual clarity and stronger methodological standards are needed to ensure that claims of personalization in clinical research are empirically meaningful and reliable.

9
Team-Based Learning Versus Lecture-Based Instruction for Chest Radiograph Interpretation in Physician Associate Education: A Quasi-Experimental Study

Kehrli, K. F.; Conner, K. R.; Eyadiel, L.; Sisson, C. B.; Smith, N.

2026-02-24 medical education 10.64898/2026.02.20.26346418
Top 0.1%
79× avg

Background: Chest radiograph interpretation is a foundational skill in physician associate (PA) education, and competence in diagnostic imaging is an accreditation standard. While a larger body of research on radiology education exists in undergraduate medical education, considerable variability in instructional approaches limits clear conclusions regarding the most effective method. Growing evidence supports the use of active learning strategies in radiology instruction. However, little published research specifically addresses radiology education within PA programs. Team-Based Learning (TBL), an active learning approach grounded in social constructivism that emphasizes preparation, collaboration, and application, may be well suited to teaching image interpretation. This study evaluates the effectiveness of TBL compared with traditional lecture-based instruction for chest radiograph interpretation. Methods: A mixed-methods, quasi-experimental cohort comparison using a pre-post design was conducted with two consecutive PA student cohorts at a single institution. One cohort received a 90-minute lecture-based session; the other cohort participated in a 90-minute TBL session. Academic performance was assessed using validated pre- and post-tests. Student satisfaction and self-efficacy were evaluated using post-session surveys derived from the Kirkpatrick model and Bandura's self-efficacy theory. Independent-samples t-tests compared quantitative outcomes, and qualitative responses were analyzed thematically. Results: Both cohorts demonstrated improvement in chest radiograph interpretation scores, with no statistically significant differences between groups in post-test performance or score improvement (p = 0.841). Survey results indicated favorable perceptions of both instructional approaches. The TBL cohort reported significantly higher ratings for engagement and peer interaction (p < 0.001). Self-efficacy ratings were higher among TBL participants for selected confidence-related items (p = 0.003, p = 0.021, p < 0.001). Qualitative responses on what contributed most to self-efficacy emphasized peer discussion in the TBL group and structured explanations in the lecture group. Conclusions: TBL produced academic performance comparable to lecture-based instruction while supporting greater learner engagement and confidence. These findings support TBL as a feasible instructional approach for chest radiograph interpretation in PA education.

10
Figure accessibility for readers with colour vision deficiency: analysis of leading medical journals

Albany-Ward, K.; Wu, Y.

2025-12-15 medical education 10.64898/2025.12.11.25342084
Top 0.2%
76× avg

Colour vision deficiency (CVD) affects up to 8% of males and 0.5% of females and currently lacks effective treatments. Individuals with CVD require visually accessible environments to flourish, the absence of which can cause unacceptable disparities in academic achievement and career progression. For current and aspiring clinician-scientists with CVD, inaccessible colour figures in scientific publications can hinder understanding of key information and potentially cause patient harm. Therefore, our work characterises the accessibility of major medical journal figures for individuals with CVD using established guidelines and provides recommendations for improving figure accessibility. We observed that among 138 journal figures evaluated from nine leading medical journals, 107 (80%) failed to conform with colour contrast and labelling requirements from the Web Content Accessibility Guidelines (WCAG), indicating that readers with CVD may be unable to perceive displayed information. Of the 395 sub-figures within these figures, 215 (55%) were judged to be completely inaccessible to individuals with protan or deutan deficiencies, the commonest CVD subtypes. Despite universal publisher declarations of compliance with WCAG directives, no journal's figures were fully compliant with these guidelines. Our findings demonstrate the need for urgent action by authors and publishers to increase the colour contrast of journal figures and add secondary labels to enable their comprehension by audiences with CVD. We believe that these design changes will also improve the clarity of figures for general audiences.

11
Teaching Ultrasound Early: Outcomes from a Student-Led POCUS Elective Course

Suri, I.; Parkas, N.; Solazzo, E.; Yu, R.; Moehrle, N.

2026-01-19 medical education 10.64898/2026.01.16.26344280
Top 0.2%
76× avg

Background: Point-of-care ultrasound (POCUS) utilization across specialties continues to grow, making it a valuable skill for medical students. Early exposure to ultrasound may enhance students' clinical reasoning, anatomical understanding, and integration of POCUS into the physical exam. This study evaluates the impact of a student-organized, physician-taught POCUS elective course on pre-clinical medical students' competence in foundational ultrasound skills. Methods: Fifteen students were enrolled in the course. Students attended four weekly 90-minute sessions, each focused on a unique organ system. Students took a 19-question test before and after the course to assess overall learning. Five-question quizzes were conducted before and after each session to evaluate immediate learning. Fisher's exact test was used to compare correct vs. incorrect quiz answers before and after each session. Results: Overall knowledge of POCUS, determined by the 19-question test, improved from 61.5% to 76.8% (p = 0.01). FAST and cardiac ultrasound quiz scores improved from 54.2% to 91.4% (p < 0.01) and 57.7% to 85.7% (p < 0.01), respectively. The pre- and post-quiz score for the abdominal ultrasound exam remained the same at 80.0% (p = 1). The gynecologic ultrasound exam score improved from 45.0% to 66.7% (p = 0.3). Conclusions: This elective course significantly increased pre-clinical medical students' knowledge of ultrasound. No statistically significant difference was noted for the abdominal or gynecologic sessions.

12
Evolution of Reporting P-values Across the Biomedical Literature, 1990-2025: an Updated Meta-Research Study

Choi, J.; Lee, K.; Chavalarias, D.; Shin, J. I.; Ioannidis, J.

2026-01-16 medical education 10.64898/2026.01.14.26344149
Top 0.2%
73× avg

Importance: Over several decades there have been extensive debates on the use and misuse of statistical significance. It would be important to capture what P-values are reported in biomedical papers and whether their patterns have changed over time. Objective: To quantify the reporting dynamics of P-values in biomedical articles in the PubMed and PubMed Central (PMC) databases over a 35-year period (1990-2025). Design: Data were retrieved from the National Library of Medicine via PubMed and PubMed Central (PMC, full-text articles included), fetching the entire accessible corpus. Records were computationally processed using a regular expression algorithm, validated for various mathematical formats, to extract reported P-values from the text. Setting: The study includes 22,734,796 PubMed abstracts, 6,031,459 PMC abstracts and 6,397,787 PMC full-texts. Main Outcomes and Measures: Proportion of articles reporting at least 1 P-value, at least 1 P-value less than thresholds (.05 and .005), distribution of P-values by magnitude and operator type. Results: The proportion of articles reporting P-values increased from 7.5% in 1990 to 18.3% in 2025 for PubMed abstracts, and from 5.2% to 53.3% for PMC full-texts. The median number of P-values per article increased from 2 to 7 in PMC full-text articles, with upward trends observed in all databases. A high proportion of P-values remains clustered around .05 and .001 in all databases. The proportion of articles reporting at least one P-value ≤ .05 has remained in the range 94%-98% since 1998, while the proportion reporting at least one P-value ≤ .005 has increased over time, reaching 57.0% for PubMed abstracts and 62.5% for PMC full-texts. The reporting of exact P-values increased until 2015, but with no further increase in the last 10 years (PubMed abstracts: 17.6% in 1990, 51.1% in 2015, 49.8% in 2025). Conclusions and Relevance: Our evaluation demonstrates the pervasive entrenchment of P-values, despite heavy debates and major changes in the content of the biomedical literature over time. More P-values are reported, and papers using P-values almost always report some that are statistically significant. Readers should remain aware of the major issues surrounding P-value misuse and misinterpretation. Key points. Question: With continuing debate regarding the use and misuse of statistical significance, how have reported P-values evolved over the past 35 years? Findings: Across over 22 million PubMed abstracts and over 6 million full-texts, reporting of P-values became more common over time. Almost all (94-98%) abstracts and full-texts reporting P-values have at least one significant at the .05 threshold. The reporting of exact P-values increased until 2015 but has plateaued since then. Clustering around traditional statistical significance thresholds remains consistent. Meaning: P-value reporting has become more common over time, with pervasive prevalence of significant P-values across the biomedical literature.
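
The abstract above describes extracting P-values from text with a regular expression and classifying them by operator type. The following is a minimal sketch of that general idea, not the authors' algorithm: the pattern `P_VALUE_RE` and the helper `extract_p_values` are hypothetical, and the pattern covers only a few common formats.

```python
import re

# Illustrative pattern only: the abstract does not give the authors' actual
# regular expression. Matches forms such as "P < .05", "p=0.003",
# and "P-value ≤ 0.001".
P_VALUE_RE = re.compile(
    r"\bp\s*(?:-?\s*values?\s*)?(<=|>=|≤|≥|<|>|=)\s*(\d*\.\d+|\d+)",
    re.IGNORECASE,
)


def extract_p_values(text):
    """Return an (operator, value) pair for each P-value found in text."""
    return [(op, float(num)) for op, num in P_VALUE_RE.findall(text)]


sample = "Scores improved (P < .001); no group effect (p = 0.841) and P-value ≤ .05."
found = extract_p_values(sample)

# "Exact" P-values are those reported with an equals sign
exact = [value for op, value in found if op == "="]

print(found)
print(exact)
```

A production extractor would also need to handle scientific notation, spelled-out operators, and false positives, none of which this sketch attempts.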

13
Women-related cardiovascular research funding under Canada's Sex- and Gender-Based Analysis policy from 2000 to 2024: an interrupted time series analysis

Chen, N.; Kendall, T.; Zhang, W.; Brotto, L. A.

2026-01-16 health policy 10.64898/2026.01.14.26344152
Top 0.2%
68× avg

Background: Cardiovascular disease is the leading cause of death among women, yet women have historically been underrepresented in cardiovascular research. In Canada, sex- and gender-based analysis (SGBA) policies were introduced to address these gaps, including mandatory applicant-level requirements in 2010 and reviewer-level guidance in 2018. How these policies have influenced funding allocation for women-related cardiovascular research remains unclear. Objective: To examine long-term trends in CIHR investment in women-related cardiovascular research and assess changes associated with SGBA policy milestones. Methods: We conducted a longitudinal study using Canadian Institutes of Health Research (CIHR) funding data from fiscal year 2000-2001 to 2024-2025. Women-related cardiovascular research projects were identified through terminology searches in funded project titles, keywords, and abstracts. Annual fiscal-year proportions of cardiovascular research funding allocated to women-related research were analysed using segmented regression, with interruption points in 2010 and 2018. Results: Among 17,168 annual grant records related to cardiovascular research, 11.33% were classified as women-related projects. These projects accounted for 11.85% of total CIHR cardiovascular research funding over the 25-year period. Funding increased before 2010, declined between 2011 and 2017, and accelerated after 2018. Segmented regression showed a small immediate increase in 2010, followed by a negative post-2010 trend and a significant positive quadratic trend after 2018. Conclusions: Application-level SGBA requirements had limited influence on investment patterns, whereas the 2018 reviewer-level guidance aligns with the subsequent acceleration in women-related cardiovascular research funding. Strengthening SGBA implementation, expanding targeted funding opportunities, and improving monitoring of sex-and-gender-disaggregated outputs may help address persistent gaps in women-related cardiovascular research.

14
Information Leaflets vs Artificial Intelligence: Comparing Perceptions of Stroke Survivors and Professionals in a Mixed-methods Study

Tvrda, L.; Burton, J. K.; McConnell, K.; Mavromati, K.; Knoche, H.; Mikulik, R.; Quinn, T. J.

2026-02-01 medical education 10.64898/2026.01.26.26344610
Top 0.2%
68× avg

Background: Stroke survivors often describe problems of insufficient access to information post-discharge. Traditional resources may not meet their information needs, but Artificial Intelligence (AI) could play a role. Aims: To compare user perceptions of stroke information from third sector stroke websites with that generated by AI and summarize the attributes of the preferred stroke information formats. Methods: UK third sector stroke websites were searched for materials relevant to 15 questions commonly asked by stroke survivors. ChatGPT(-4o) was used to generate responses to these questions. Stroke professionals (clinicians, researchers), stroke survivors, and their caregivers reviewed third sector and AI responses, indicating the source of the response and their preferred text. Participants also rated responses on scales of empathy, trustworthiness, reliability, comprehensibility and usefulness, and provided free text comments. Proportions of preference and correctly guessed responses, as well as mean ratings, were compared between the groups. Framework analysis was used to identify the attributes of response formats preferred by stroke survivors. Results: Relevant responses were found for 13 (87%) out of 15 questions. Across groups, 60 participants (mean age 44, SD = 14; 57% female) correctly identified 184/300 (61%) of AI responses and preferred the AI response in 123/300 (46%) of cases. The groups differed in their preference, with clinicians being least likely to choose AI (34%), followed by stroke survivors (49%) and researchers (54%). All groups viewed third sector responses as more empathetic in tone. Stroke survivors described themes of content, structure, and tone of responses, with an emphasis on clarity, conciseness, and an approachable tone. Conclusions: AI-generated responses to stroke questions were rated positively by stroke survivors and researchers, whereas stroke clinicians were more sceptical.
Given that stroke information materials are intended for people with lived experience of stroke, their input should be prioritized to inform development of new resources.

15
Outcome Orientation vs Problem Orientation: Preliminary Validation of a Novel Cognitive Assessment Tool and Its Relationship to Burnout in Advanced Practice Providers

Cartner, B. W.; Schmauss, S.; Bucala, M.; Ghim, M. Y.; Guerrini, J.

2026-03-02 medical education 10.64898/2026.02.20.26346714
Top 0.2%
65× avg

Background: Advanced Practice Providers (APPs) in emergency and urgent care settings experience high burnout rates, yet limited research examines cognitive factors influencing professional fulfillment. The Empowerment Dynamic framework suggests outcome-oriented thinking may protect against burnout compared to problem-oriented patterns. Objective: To examine relationships between cognitive mindset orientation, professional fulfillment, and burnout among APPs while providing preliminary validation of a novel cognitive assessment instrument. Methods: Cross-sectional survey of licensed APPs working in emergency departments and urgent care facilities across two health systems (July-October 2025). Professional fulfillment and burnout were measured using the Stanford Professional Fulfillment Index; cognitive orientation was assessed using a newly developed 22-item instrument. Results: Among 98 respondents (19.5% response rate), mean professional fulfillment was 5.8 and mean burnout was 4.5; 40.8% met burnout criteria. Professional fulfillment and burnout were inversely correlated (r = -0.62; P < .001). Problem orientation correlated positively with burnout (r = 0.56) and negatively with fulfillment (r = -0.36), while outcome orientation showed opposite patterns (burnout: r = -0.57; fulfillment: r = 0.44). In multivariable models, outcome orientation remained independently associated with lower burnout (β = -1.51; P = .003) and higher fulfillment (β = 1.73; P = .002). Conclusions: Cognitive mindset orientation is associated with burnout and professional fulfillment among APPs. The novel assessment instrument demonstrates acceptable psychometric properties. Future longitudinal studies are needed to establish causality and evaluate cognitive interventions for burnout prevention.

16
Development and Validation of CPX-MATE: An End-to-End Medical Education Platform Integrating Voice-Based Virtual Patient Simulation and Automated Real-time Evaluation

Song, J. W.; Kim, M.; Hong, C.; Kim, Y. S.; Cho, J.; Kim, J. H.; Myung, J.; Choi, A.; Yoon, H.; Lee, S. G. W.; You, S. C.; Park, C.

2026-02-25 medical education 10.64898/2026.02.21.26346803
Top 0.3%
49× avg

Background: Objective Structured Clinical Examination (OSCE; Clinical Performance Examination [CPX] in South Korea) is a high-stakes assessment of clinical performance, communication, and reasoning during time-limited patient encounters. As AI-enabled virtual standardized patient (VSP) simulation and automated scoring are introduced for OSCE-like training, prospective evidence is needed on how such systems perform and are perceived when embedded in real educational workflows. Methods: We developed CPX with Medical students Assistant for Training and Evaluation (CPX-MATE), a web-based platform integrating (1) CPX with Virtual Standardized Patient (CPX-VSP), real-time voice dialogue with a VSP using speech-to-speech (STS) models, and (2) CPX with Real-Time Evaluator (CPX-RTE), automated transcription, checklist-based scoring, and feedback from encounter audio using a Speech-to-Text model and a large language model. During an emergency medicine clerkship (Nov 2025-Jan 2026), 60 senior medical students completed two 12-min CPX encounters (VSP with acute pancreatitis; HSP with ureteral stone) with immediate CPX-RTE feedback. For CPX-VSP, students were assigned to either a full-capacity or a resource-limited STS configuration (n=30 each). Dialogue fidelity was evaluated by turn-by-turn analysis of student-VSP exchanges, classifying responses into clinically meaningful error types (tangential, oversharing, role-breaking, off-script). CPX-RTE performance was assessed by agreement (Gwet's AC1) with professor real-time and resident video-based ratings using a 45-item checklist. Usability of CPX-VSP and CPX-RTE, along with overall usability on the System Usability Scale (SUS), was surveyed, and mean per-session costs for CPX-VSP and CPX-RTE were calculated. Results: Across 3,282 dialogue turns, overall error rates were 1.77% versus 9.43% for full-capacity versus resource-limited STS configurations (p<0.001), driven by fewer tangential and oversharing responses; no off-script errors were observed.
The mean per-session cost was $0.12 for the resource-limited configuration and $0.78 for the full-capacity configuration. CPX-RTE showed high agreement with human ratings (AC1=0.916 vs professor; 0.916 vs resident), with slightly different levels of agreement across four sections, and high usability across all domains (mean scores, 4.65-4.92), with a per-session cost of $0.17. CPX-MATE demonstrated good overall usability (median [IQR] of 77.5 [70.0-85.0]). Conclusions: Embedded within a prospective clinical clerkship, CPX-MATE demonstrated operational fidelity and human-level checklist agreement as an end-to-end, voice-based AI-assisted OSCE platform. This real-world deployment supports its scalable integration as a complementary assessment tool while highlighting the importance of systematic validation and context-aware implementation in medical education.

17
Telehealth consultation in medical education: a mixed methods study of medical students' and educators' experiences

Wetzlmair, L.-C.; O'Malley, A.; O'Carroll, V.

2025-12-14 medical education 10.64898/2025.12.10.25342002
Top 0.3%
48× avg

Background: The COVID-19 pandemic accelerated the adoption of telehealth in healthcare delivery and education. However, the integration of telehealth into medical curricula is often limited to institutional initiatives and remains scattered. This raises questions about the experienced barriers and enablers for education in telehealth consultation. Objective: This study sought to explore medical students' and educators' experiences with telehealth consultations in academic and clinical learning environments during the pandemic in the United Kingdom. Methods: A cross-sectional mixed-methods design was employed. Quantitative data were gathered via online questionnaires, while qualitative insights were derived from semi-structured interviews. A total of 248 participants (153 students, 87 educators) responded to the survey; of those, 23 participants (13 students, 10 educators) were interviewed. Results: Telehealth consultations were primarily taught and delivered during clinical placements. The pandemic and subsequent educational adjustments were identified as enabling factors. Exposure to telehealth increased students' confidence and readiness, particularly through active participation in consultations. Barriers included inadequate technical infrastructure, lack of confidential spaces, and limited confidence of educators teaching telehealth consultation. Conclusion: The findings highlight the need for an integrated telehealth curriculum that incorporates practical exposure and addresses technical limitations in both universities and healthcare settings.

18
Financial Outcomes and Community Benefit in the 340B Program: Comparing 340B and Non-340B Hospitals

Popovian, R.; Sydor, A. M.; Czubaruk, K.; Walker, M.; Smith, W.

2026-02-17 health policy 10.64898/2026.02.12.26346191
Top 0.3%
46× avg

Background: The 340B Drug Pricing Program was established to expand access to care for low-income and uninsured patients by allowing safety-net hospitals and clinics to purchase outpatient drugs at discounted prices. Over time, the program has expanded substantially, raising questions about whether participating hospitals are meeting the program's intended objectives. Methods: Using 2023 hospital financial data from the RAND Corporation, we conducted cross-sectional descriptive comparisons of 340B and non-340B hospitals nationwide. Key measures included charity care as a percentage of operating expenses, Medicaid admissions as a share of hospital days, uncompensated care, and costs associated with uninsured patients approved for charity care. Subgroup analyses also examined the performance of Disproportionate Share Hospitals (DSH), Critical Access Hospitals (CAH), Rural Referral Centers (RRC), Sole Community Hospitals (SCH), and National Cancer Institute (NCI) designated hospitals. Results: Among 3,999 hospitals analyzed, 340B hospitals provided, on average, lower levels of charity care than non-340B hospitals (2.16% vs. 2.82% of operating expenses) and lower costs of charity care for uninsured patients (1.60% vs. 2.26%). However, 340B hospitals served a higher proportion of Medicaid patients (19.69% vs. 17.76%). Substantial variation was observed across 340B subcategories: DSH hospitals reported the highest Medicaid utilization, while CAH hospitals reported the lowest levels of charity care and Medicaid days. Conclusions: Participation in the 340B program does not uniformly correlate with greater provision of charity care or uncompensated care. These findings suggest a misalignment between program intent and outcomes and support the need for greater transparency, standardized eligibility criteria, and minimum charity care requirements to ensure that 340B savings directly benefit underserved populations.

19
Historical Perspectives in Medicine using a Large Language Model: Emulating an 18th Century Physician

Malladi, P.; Eaton, J.; Gleichgerrcht, E.; Chatzistamou, I.; Roark, K.; Kennedy, S. W.; Bonilha, L.

2026-02-12 medical education 10.64898/2026.02.10.26345990
Top 0.3%
46× avg

Introduction: Eighteenth-century medical texts document a formative period in the evolution of clinical reasoning, yet their integration into modern medical education is limited. The traditional approach to learning the history of medicine has naturally focused on passive reading, but new approaches using AI could enable learners to interrogate and simulate the historical diagnostic logic and therapeutic paradigms. More specifically, large language models (LLMs) offer an opportunity to create interactive simulations that allow experiential engagement with historical medical reasoning. Methods: We developed a historically constrained LLM-based educational platform designed to emulate the diagnostic reasoning, language, and conceptual frameworks of an 18th-century physician. A modern GPT architecture was customized using strict instruction-based constraints and limited exclusively to a curated corpus of six foundational 17th- and 18th-century medical texts. Guardrails were implemented to prevent anachronistic terminology and modern medical concepts. Model outputs were evaluated qualitatively by comparing the model's diagnoses and treatment plans with published diagnoses and treatments from original 18th-century sources. We also applied the simulation to modern clinical vignettes for an illustrative contrast between modern and 18th-century approaches. Results: The model generated responses that closely aligned with 18th-century medical and rhetorical style, as well as therapeutic reasoning. When presented with historical cases, the simulation demonstrated strong concordance with original diagnoses and management strategies. When applied to modern cases, the model described period-appropriate reasoning, highlighting clear contrasts with contemporary biomedical reasoning. Conclusions: AI broadly, and more specifically LLMs configured as historically constrained simulators, can function as effective tools for learning in medical history.
This approach could enable active engagement with historical clinical reasoning, fostering critical reflection on the contingent and evolving nature of medical knowledge. Such temporal simulations hold promise for medical humanities education and interdisciplinary teaching.

20
Hair Cortisol as a Marker of Physiologic Stress in Residency Training

Hinz, L. E.; Lithgow, K. A.; Kunimoto, K. A.; Kline, G. A.

2026-01-19 medical education 10.64898/2026.01.16.26344232
Top 0.3%
44× avg

Background: Hair cortisol analysis allows assessment of long-term cortisol exposure and may provide insight into chronic hypothalamic-pituitary-adrenal activation in medical residents and residency on-call responsibilities. Objective: To determine the hair cortisol concentration (HCC) representing 3 months of medical residency and secondarily, its association with various on-call models (in-hospital, night float, home call and no call). Design: Cross-sectional study of 66 medical residents who were recruited to provide hair samples collected after a three-month block in medical residency. Setting: Academic, tertiary health care system. Participants: Volunteer sample of first through third year medical and primary care residents. Exposure: 3 cm of hair was divided into 3 segments of 1 cm each; each segment represented 1 month of cumulative cortisol production. Main Outcome Measure: HCC results were compared to a published, cortisol assay-specific normative population reference interval. HCC results were interpreted according to a priori categorizations of moderate (+1.5SD), considerable (+2SD) or extreme (> +3SD) HCC elevations. Associations with various on-call models were an exploratory secondary outcome. Results: The median age was 28 (26-30) years with median sleep duration of 2 hours on in-hospital call. 40% of trainees had at least one HCC segment above the threshold deemed marked elevation. Median HCC was significantly higher for in-hospital and night float vs. no call (285 ng/g and 335 ng/g vs 78 ng/g, p<0.05) and approached significance compared to home call (190 ng/g, p = 0.06). Conclusions and Relevance: We have described chronic exposure to endogenous cortisol in medical residency. Nearly half of trainees experienced at least one month of severe hypothalamic-pituitary-adrenal axis activation in a 3-month timeframe; many had marked chronic cortisol elevations across the entire 3-month observation frame.
HCC was higher in months where in-hospital on-call was required. This may have implications for long-term health of trainees and raises questions about the structure of duty hours and sequence of care acuity blocks within residency training programs.